Overview

Dataset statistics

Number of variables: 21
Number of observations: 3833
Missing cells: 59
Missing cells (%): 0.1%
Duplicate rows: 500
Duplicate rows (%): 13.0%
Total size in memory: 602.8 KiB
Average record size in memory: 161.0 B
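
The derived figures in this table can be cross-checked from the raw counts. The sketch below is only a sanity check, not how the report itself was produced; it assumes that the missing-cell percentage is taken over variables × observations and that 1 KiB is 1024 bytes.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 3833
missing_cells = 59
duplicate_rows = 500
total_size_kib = 602.8


def pct(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumption: the denominator for missing cells is the full cell count.
missing_pct = pct(missing_cells, n_variables * n_observations)
duplicate_pct = pct(duplicate_rows, n_observations)
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values round to the figures shown in the table.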

Variable types

NUM: 16
BOOL: 3
CAT: 2

Reproduction

Analysis started: 2020-07-27 12:36:13.260460
Analysis finished: 2020-07-27 12:37:01.011011
Duration: 47.75 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 500 (13.0%) duplicate rows (Duplicates)
State has a high cardinality: 51 distinct values (High cardinality)
TotalMorCharge is highly correlated with TotalMorMin (High correlation)
TotalMorMin is highly correlated with TotalMorCharge (High correlation)
TotalEveCharge is highly correlated with TotalEveMin (High correlation)
TotalEveMin is highly correlated with TotalEveCharge (High correlation)
TotalNightCharge is highly correlated with TotalNightMin (High correlation)
TotalNightMin is highly correlated with TotalNightCharge (High correlation)
TotalIntCharge is highly correlated with TotalIntMinutes (High correlation)
TotalIntMinutes is highly correlated with TotalIntCharge (High correlation)
NumEmailMessages has 2753 (71.8%) zeros (Zeros)
CustomerServiceCalls has 801 (20.9%) zeros (Zeros)
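
The quantities behind the duplicate, cardinality and zero warnings are simple counts. The following is a minimal sketch of how they could be recomputed on raw rows; the helper names and the toy data are hypothetical, and the report does not state the thresholds that trigger each warning. Duplicates are counted as rows that repeat an earlier row, which is consistent with the figures above (3833 observations, 3333 distinct PhoneNumber values, 500 duplicates), but that reading is an assumption.

```python
def duplicate_count(rows):
    """Number of rows that repeat an earlier row (first occurrences excluded)."""
    return len(rows) - len(set(rows))


def distinct_count(values):
    """Number of distinct values in a column."""
    return len(set(values))


def zero_share(values):
    """Fraction of entries in a column that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy rows of (State, AccountLength, NumEmailMessages); hypothetical data.
rows = [("AK", 41, 0), ("AK", 41, 0), ("AL", 47, 28), ("KS", 128, 25)]

print(duplicate_count(rows))                    # rows repeating an earlier row
print(distinct_count([r[0] for r in rows]))     # distinct State values
print(zero_share([r[2] for r in rows]))         # share of zero NumEmailMessages
```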

Variables

State
Categorical

HIGH CARDINALITY

Distinct count: 51
Unique (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

Value | Count | Frequency (%)
WV | 123 | 3.2%
MN | 96 | 2.5%
AL | 94 | 2.5%
NY | 93 | 2.4%
WY | 92 | 2.4%
WI | 92 | 2.4%
OR | 90 | 2.3%
VT | 88 | 2.3%
OH | 87 | 2.3%
VA | 86 | 2.2%
Other values (41) | 2892 | 75.5%

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

AccountLength
Real number (ℝ≥0)

Distinct count: 212
Unique (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.69762588051135
Minimum: 1
Maximum: 243
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 35
Q1: 73
Median: 100
Q3: 127
95-th percentile: 167
Maximum: 243
Range: 242
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 39.87235815
Coefficient of variation (CV): 0.3959612534
Kurtosis: -0.1284573965
Mean: 100.6976259
Median Absolute Deviation (MAD): 27
Skewness: 0.1106040285
Sum: 385974
Variance: 1589.804945
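
Several of the AccountLength statistics are functions of the others: the mean is the sum over the number of non-missing observations, the variance is the squared standard deviation, the CV is the standard deviation over the mean, and the range and IQR are differences of quantiles. A minimal sketch that checks these relations against the reported values follows; the function name is hypothetical, and n = 3833 is used because this variable has no missing values.

```python
def derived_stats(total, n, std, q1, q3, minimum, maximum):
    """Recompute dependent statistics from the primary ones."""
    mean = total / n
    return {
        "mean": mean,
        "variance": std ** 2,
        "cv": std / mean,
        "range": maximum - minimum,
        "iqr": q3 - q1,
    }


# Inputs copied from the AccountLength statistics.
account_length = derived_stats(
    total=385974, n=3833, std=39.87235815,
    q1=73, q3=127, minimum=1, maximum=243,
)

for name, value in account_length.items():
    print(f"{name}: {value}")
```

The recomputed values agree with the report up to the rounding of the inputs.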
Histogram with fixed size bins (bins=10)
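Value | Count | Frequency (%)
87 | 52 | 1.4%
105 | 47 | 1.2%
90 | 46 | 1.2%
93 | 45 | 1.2%
101 | 45 | 1.2%
95 | 42 | 1.1%
86 | 42 | 1.1%
112 | 41 | 1.1%
99 | 41 | 1.1%
92 | 41 | 1.1%
Other values (202) | 3391 | 88.5%

Value | Count | Frequency (%)
1 | 9 | 0.2%
2 | 1 | < 0.1%
3 | 6 | 0.2%
4 | 1 | < 0.1%
5 | 2 | 0.1%

Value | Count | Frequency (%)
243 | 2 | 0.1%
232 | 1 | < 0.1%
225 | 2 | 0.1%
224 | 2 | 0.1%
221 | 1 | < 0.1%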
 

AreaCode
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

Value | Count | Frequency (%)
415 | 1903 | 49.6%
510 | 976 | 25.5%
408 | 954 | 24.9%

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

PhoneNumber
Real number (ℝ≥0)

Distinct count: 3333
Unique (%): 87.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3746407.146099661
Minimum: 3271058
Maximum: 4229964
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 3271058
5-th percentile: 3323632
Q1: 3506473
Median: 3749107
Q3: 3988385
95-th percentile: 4174597.6
Maximum: 4229964
Range: 958906
Interquartile range (IQR): 481912

Descriptive statistics

Standard deviation: 275881.3736
Coefficient of variation (CV): 0.07363891934
Kurtosis: -1.232389722
Mean: 3746407.146
Median Absolute Deviation (MAD): 240763
Skewness: 0.008673364768
Sum: 1.435997859e+10
Variance: 7.611053232e+10
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3665918 | 2 | 0.1%
3275525 | 2 | 0.1%
3841833 | 2 | 0.1%
4029691 | 2 | 0.1%
3929342 | 2 | 0.1%
3502832 | 2 | 0.1%
3765908 | 2 | 0.1%
3572679 | 2 | 0.1%
3343289 | 2 | 0.1%
3908876 | 2 | 0.1%
Other values (3323) | 3813 | 99.5%

Value | Count | Frequency (%)
3271058 | 1 | < 0.1%
3271319 | 2 | 0.1%
3273053 | 2 | 0.1%
3273587 | 2 | 0.1%
3273850 | 1 | < 0.1%

Value | Count | Frequency (%)
4229964 | 1 | < 0.1%
4228344 | 1 | < 0.1%
4228333 | 1 | < 0.1%
4228268 | 1 | < 0.1%
4227728 | 1 | < 0.1%
 

InternationalPlan?
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

Value | Count | Frequency (%)
No | 3459 | 90.2%
Yes | 374 | 9.8%
 

VoiceMailPlan?
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.9 KiB

Value | Count | Frequency (%)
No | 2753 | 71.8%
Yes | 1080 | 28.2%
 

NumEmailMessages
Real number (ℝ≥0)

ZEROS

Distinct count: 46
Unique (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.227758935559613
Minimum: 0
Maximum: 51
Zeros: 2753
Zeros (%): 71.8%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 20
95-th percentile: 36
Maximum: 51
Range: 51
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.72443692
Coefficient of variation (CV): 1.668065025
Kurtosis: -0.1262028895
Mean: 8.227758936
Median Absolute Deviation (MAD): 0
Skewness: 1.234682156
Sum: 31537
Variance: 188.3601687
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 2753 | 71.8%
31 | 75 | 2.0%
29 | 62 | 1.6%
28 | 60 | 1.6%
33 | 54 | 1.4%
24 | 52 | 1.4%
26 | 52 | 1.4%
27 | 52 | 1.4%
30 | 50 | 1.3%
32 | 48 | 1.3%
Other values (36) | 575 | 15.0%

Value | Count | Frequency (%)
0 | 2753 | 71.8%
4 | 1 | < 0.1%
8 | 2 | 0.1%
9 | 2 | 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
51 | 2 | 0.1%
50 | 2 | 0.1%
49 | 1 | < 0.1%
48 | 2 | 0.1%
47 | 3 | 0.1%
 

TotalMorMin
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1661
Unique (%): 43.7%
Missing: 30
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.07851696029454
Minimum: 0.0
Maximum: 350.8
Zeros: 2
Zeros (%): 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 89.82
Q1: 144
Median: 179.9
Q3: 216.65
95-th percentile: 271.1
Maximum: 350.8
Range: 350.8
Interquartile range (IQR): 72.65

Descriptive statistics

Standard deviation: 54.66461122
Coefficient of variation (CV): 0.3035598701
Kurtosis: -0.01402366085
Mean: 180.078517
Median Absolute Deviation (MAD): 36.3
Skewness: -0.02261846973
Sum: 684838.6
Variance: 2988.21972
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
154 | 11 | 0.3%
159.5 | 10 | 0.3%
153.5 | 9 | 0.2%
146.3 | 8 | 0.2%
215.6 | 8 | 0.2%
209.9 | 8 | 0.2%
175.4 | 8 | 0.2%
174.5 | 8 | 0.2%
142.3 | 8 | 0.2%
183.4 | 8 | 0.2%
Other values (1651) | 3717 | 97.0%
(Missing) | 30 | 0.8%

Value | Count | Frequency (%)
0 | 2 | 0.1%
2.6 | 1 | < 0.1%
7.8 | 1 | < 0.1%
7.9 | 1 | < 0.1%
12.5 | 1 | < 0.1%

Value | Count | Frequency (%)
350.8 | 1 | < 0.1%
346.8 | 2 | 0.1%
345.3 | 1 | < 0.1%
337.4 | 1 | < 0.1%
335.5 | 2 | 0.1%
 

TotalMorCalls
Real number (ℝ≥0)

Distinct count: 119
Unique (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.36029219932168
Minimum: 0
Maximum: 165
Zeros: 2
Zeros (%): 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 67
Q1: 87
Median: 101
Q3: 114
95-th percentile: 133
Maximum: 165
Range: 165
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 20.05586819
Coefficient of variation (CV): 0.1998386787
Kurtosis: 0.2069223211
Mean: 100.3602922
Median Absolute Deviation (MAD): 13
Skewness: -0.0932414474
Sum: 384681
Variance: 402.2378488
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
102 | 88 | 2.3%
105 | 85 | 2.2%
107 | 84 | 2.2%
88 | 84 | 2.2%
97 | 83 | 2.2%
110 | 81 | 2.1%
104 | 80 | 2.1%
95 | 79 | 2.1%
108 | 78 | 2.0%
112 | 76 | 2.0%
Other values (109) | 3015 | 78.7%

Value | Count | Frequency (%)
0 | 2 | 0.1%
30 | 1 | < 0.1%
35 | 1 | < 0.1%
36 | 2 | 0.1%
40 | 3 | 0.1%

Value | Count | Frequency (%)
165 | 1 | < 0.1%
163 | 1 | < 0.1%
160 | 1 | < 0.1%
158 | 3 | 0.1%
157 | 1 | < 0.1%
 

TotalMorCharge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1667
Unique (%): 43.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.61445603965562
Minimum: 0.0
Maximum: 59.64
Zeros: 2
Zeros (%): 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 15.262
Q1: 24.48
Median: 30.6
Q3: 36.82
95-th percentile: 46.094
Maximum: 59.64
Range: 59.64
Interquartile range (IQR): 12.34

Descriptive statistics

Standard deviation: 9.292340501
Coefficient of variation (CV): 0.3035278657
Kurtosis: -0.01138899956
Mean: 30.61445604
Median Absolute Deviation (MAD): 6.15
Skewness: -0.02466342212
Sum: 117345.21
Variance: 86.34759199
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
26.18 | 11 | 0.3%
27.12 | 10 | 0.3%
26.1 | 9 | 0.2%
24.87 | 8 | 0.2%
24.19 | 8 | 0.2%
29.82 | 8 | 0.2%
31.18 | 8 | 0.2%
36.65 | 8 | 0.2%
35.68 | 8 | 0.2%
27.59 | 8 | 0.2%
Other values (1657) | 3747 | 97.8%

Value | Count | Frequency (%)
0 | 2 | 0.1%
0.44 | 1 | < 0.1%
1.33 | 1 | < 0.1%
1.34 | 1 | < 0.1%
2.13 | 1 | < 0.1%

Value | Count | Frequency (%)
59.64 | 1 | < 0.1%
58.96 | 2 | 0.1%
58.7 | 1 | < 0.1%
57.36 | 1 | < 0.1%
57.04 | 2 | 0.1%
 

TotalEveMin
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1609
Unique (%): 42.1%
Missing: 11
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.68592360020935
Minimum: 0.0
Maximum: 363.7
Zeros: 1
Zeros (%): < 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 118.01
Q1: 165.8
Median: 201
Q3: 235.3
95-th percentile: 285.195
Maximum: 363.7
Range: 363.7
Interquartile range (IQR): 69.5

Descriptive statistics

Standard deviation: 51.18062928
Coefficient of variation (CV): 0.2550284961
Kurtosis: 0.03312453356
Mean: 200.6859236
Median Absolute Deviation (MAD): 34.9
Skewness: -0.03288100885
Sum: 767021.6
Variance: 2619.456813
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
169.9 | 11 | 0.3%
202.6 | 9 | 0.2%
167.9 | 8 | 0.2%
223 | 8 | 0.2%
219.1 | 8 | 0.2%
211.7 | 8 | 0.2%
195.5 | 8 | 0.2%
161.7 | 8 | 0.2%
220.6 | 8 | 0.2%
224.7 | 8 | 0.2%
Other values (1599) | 3738 | 97.5%
(Missing) | 11 | 0.3%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
31.2 | 1 | < 0.1%
42.2 | 2 | 0.1%
42.5 | 2 | 0.1%
43.9 | 2 | 0.1%

Value | Count | Frequency (%)
363.7 | 1 | < 0.1%
361.8 | 1 | < 0.1%
354.2 | 1 | < 0.1%
351.6 | 1 | < 0.1%
350.9 | 1 | < 0.1%
 

TotalEveCalls
Real number (ℝ≥0)

Distinct count: 123
Unique (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.12444560396557
Minimum: 0
Maximum: 170
Zeros: 1
Zeros (%): < 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 66
Q1: 87
Median: 101
Q3: 114
95-th percentile: 133
Maximum: 170
Range: 170
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 19.93059224
Coefficient of variation (CV): 0.1990582032
Kurtosis: 0.2446432281
Mean: 100.1244456
Median Absolute Deviation (MAD): 13
Skewness: -0.09629527441
Sum: 383777
Variance: 397.2285072
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
105 | 94 | 2.5%
94 | 90 | 2.3%
108 | 83 | 2.2%
102 | 83 | 2.2%
101 | 81 | 2.1%
97 | 79 | 2.1%
88 | 77 | 2.0%
103 | 77 | 2.0%
109 | 77 | 2.0%
111 | 75 | 2.0%
Other values (113) | 3017 | 78.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
12 | 2 | 0.1%
36 | 1 | < 0.1%
37 | 2 | 0.1%
42 | 1 | < 0.1%

Value | Count | Frequency (%)
170 | 1 | < 0.1%
168 | 1 | < 0.1%
164 | 1 | < 0.1%
159 | 1 | < 0.1%
157 | 1 | < 0.1%
 

TotalEveCharge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1440
Unique (%): 37.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.05778241586225
Minimum: 0.0
Maximum: 30.91
Zeros: 1
Zeros (%): < 0.1%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10.042
Q1: 14.09
Median: 17.09
Q3: 20
95-th percentile: 24.234
Maximum: 30.91
Range: 30.91
Interquartile range (IQR): 5.91

Descriptive statistics

Standard deviation: 4.346912906
Coefficient of variation (CV): 0.2548345852
Kurtosis: 0.03489822183
Mean: 17.05778242
Median Absolute Deviation (MAD): 2.96
Skewness: -0.03232485752
Sum: 65382.48
Variance: 18.89565181
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
18.62 | 12 | 0.3%
16.12 | 12 | 0.3%
18.96 | 11 | 0.3%
14.44 | 11 | 0.3%
14.25 | 11 | 0.3%
15.9 | 11 | 0.3%
17.99 | 11 | 0.3%
17.09 | 9 | 0.2%
16.18 | 9 | 0.2%
16.97 | 9 | 0.2%
Other values (1430) | 3727 | 97.2%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
2.65 | 1 | < 0.1%
3.59 | 2 | 0.1%
3.61 | 2 | 0.1%
3.73 | 2 | 0.1%

Value | Count | Frequency (%)
30.91 | 1 | < 0.1%
30.75 | 1 | < 0.1%
30.11 | 1 | < 0.1%
29.89 | 1 | < 0.1%
29.83 | 1 | < 0.1%
 

TotalNightMin
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 1589
Unique (%): 41.7%
Missing: 18
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.64432503276538
Minimum: 23.2
Maximum: 395.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 23.2
5-th percentile: 117.97
Q1: 166.9
Median: 200.9
Q3: 235.25
95-th percentile: 283.26
Maximum: 395
Range: 371.8
Interquartile range (IQR): 68.35

Descriptive statistics

Standard deviation: 50.79240478
Coefficient of variation (CV): 0.2531464808
Kurtosis: 0.07705348271
Mean: 200.644325
Median Absolute Deviation (MAD): 34.1
Skewness: 0.00538238217
Sum: 765458.1
Variance: 2579.868383
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
190.5 | 10 | 0.3%
197.4 | 10 | 0.3%
231.5 | 9 | 0.2%
214.6 | 9 | 0.2%
188.2 | 8 | 0.2%
221.6 | 8 | 0.2%
210 | 8 | 0.2%
205.1 | 8 | 0.2%
191.4 | 8 | 0.2%
198.5 | 8 | 0.2%
Other values (1579) | 3729 | 97.3%
(Missing) | 18 | 0.5%

Value | Count | Frequency (%)
23.2 | 1 | < 0.1%
43.7 | 1 | < 0.1%
45 | 2 | 0.1%
47.4 | 1 | < 0.1%
50.1 | 2 | 0.1%

Value | Count | Frequency (%)
395 | 1 | < 0.1%
381.9 | 1 | < 0.1%
377.5 | 1 | < 0.1%
367.7 | 1 | < 0.1%
364.9 | 1 | < 0.1%
 

TotalNightCalls
Real number (ℝ≥0)

Distinct count: 120
Unique (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.21627967649361
Minimum: 33
Maximum: 175
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 33
5-th percentile: 68
Q1: 87
Median: 100
Q3: 114
95-th percentile: 132
Maximum: 175
Range: 142
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 19.54282878
Coefficient of variation (CV): 0.1950065283
Kurtosis: -0.1090674252
Mean: 100.2162797
Median Absolute Deviation (MAD): 13
Skewness: 0.03654890648
Sum: 384129
Variance: 381.9221566
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
105 | 98 | 2.6%
104 | 91 | 2.4%
91 | 89 | 2.3%
102 | 82 | 2.1%
106 | 78 | 2.0%
103 | 77 | 2.0%
100 | 77 | 2.0%
94 | 76 | 2.0%
92 | 76 | 2.0%
98 | 74 | 1.9%
Other values (110) | 3015 | 78.7%

Value | Count | Frequency (%)
33 | 1 | < 0.1%
36 | 1 | < 0.1%
38 | 1 | < 0.1%
42 | 2 | 0.1%
44 | 1 | < 0.1%

Value | Count | Frequency (%)
175 | 1 | < 0.1%
166 | 1 | < 0.1%
164 | 1 | < 0.1%
158 | 2 | 0.1%
157 | 2 | 0.1%
 

TotalNightCharge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 933
Unique (%): 24.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.029924341247064
Minimum: 1.04
Maximum: 17.77
Zeros: 0
Zeros (%): 0.0%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 1.04
5-th percentile: 5.31
Q1: 7.52
Median: 9.05
Q3: 10.58
95-th percentile: 12.734
Maximum: 17.77
Range: 16.73
Interquartile range (IQR): 3.06

Descriptive statistics

Standard deviation: 2.282705428
Coefficient of variation (CV): 0.2527934168
Kurtosis: 0.0810045426
Mean: 9.029924341
Median Absolute Deviation (MAD): 1.53
Skewness: 0.004077067018
Sum: 34611.7
Variance: 5.21074407
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
8.88 | 17 | 0.4%
9.66 | 16 | 0.4%
8.57 | 16 | 0.4%
9.48 | 15 | 0.4%
8.47 | 15 | 0.4%
9.23 | 15 | 0.4%
9.45 | 15 | 0.4%
7.15 | 14 | 0.4%
7.69 | 14 | 0.4%
6.48 | 13 | 0.3%
Other values (923) | 3683 | 96.1%

Value | Count | Frequency (%)
1.04 | 1 | < 0.1%
1.97 | 1 | < 0.1%
2.03 | 2 | 0.1%
2.13 | 1 | < 0.1%
2.25 | 2 | 0.1%

Value | Count | Frequency (%)
17.77 | 1 | < 0.1%
17.19 | 1 | < 0.1%
16.99 | 1 | < 0.1%
16.55 | 1 | < 0.1%
16.42 | 1 | < 0.1%
 

TotalIntMinutes
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 162
Unique (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.24124706496217
Minimum: 0.0
Maximum: 20.0
Zeros: 21
Zeros (%): 0.5%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5.7
Q1: 8.5
Median: 10.3
Q3: 12.1
95-th percentile: 14.7
Maximum: 20
Range: 20
Interquartile range (IQR): 3.6

Descriptive statistics

Standard deviation: 2.800543716
Coefficient of variation (CV): 0.2734572946
Kurtosis: 0.5978019651
Mean: 10.24124706
Median Absolute Deviation (MAD): 1.8
Skewness: -0.2372642836
Sum: 39254.7
Variance: 7.843045104
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
11.3 | 68 | 1.8%
10 | 65 | 1.7%
11 | 64 | 1.7%
9.8 | 63 | 1.6%
11.4 | 63 | 1.6%
10.2 | 61 | 1.6%
10.9 | 61 | 1.6%
9.2 | 60 | 1.6%
10.1 | 60 | 1.6%
10.6 | 60 | 1.6%
Other values (152) | 3208 | 83.7%

Value | Count | Frequency (%)
0 | 21 | 0.5%
1.1 | 1 | < 0.1%
1.3 | 2 | 0.1%
2 | 2 | 0.1%
2.1 | 2 | 0.1%

Value | Count | Frequency (%)
20 | 1 | < 0.1%
18.9 | 1 | < 0.1%
18.4 | 1 | < 0.1%
18.3 | 1 | < 0.1%
18.2 | 3 | 0.1%
 

TotalIntCalls
Real number (ℝ≥0)

Distinct count: 21
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.495173493347248
Minimum: 0
Maximum: 20
Zeros: 21
Zeros (%): 0.5%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 4
Q3: 6
95-th percentile: 9
Maximum: 20
Range: 20
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.476094511
Coefficient of variation (CV): 0.5508340255
Kurtosis: 3.153150931
Mean: 4.495173493
Median Absolute Deviation (MAD): 1
Skewness: 1.348623378
Sum: 17230
Variance: 6.131044027
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3 | 770 | 20.1%
4 | 718 | 18.7%
2 | 554 | 14.5%
5 | 546 | 14.2%
6 | 388 | 10.1%
7 | 243 | 6.3%
1 | 180 | 4.7%
8 | 140 | 3.7%
9 | 124 | 3.2%
10 | 53 | 1.4%
Other values (11) | 117 | 3.1%

Value | Count | Frequency (%)
0 | 21 | 0.5%
1 | 180 | 4.7%
2 | 554 | 14.5%
3 | 770 | 20.1%
4 | 718 | 18.7%

Value | Count | Frequency (%)
20 | 1 | < 0.1%
19 | 1 | < 0.1%
18 | 4 | 0.1%
17 | 1 | < 0.1%
16 | 2 | 0.1%
 

TotalIntCharge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 162
Unique (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7656561440125227
Minimum: 0.0
Maximum: 5.4
Zeros: 21
Zeros (%): 0.5%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.54
Q1: 2.3
Median: 2.78
Q3: 3.27
95-th percentile: 3.97
Maximum: 5.4
Range: 5.4
Interquartile range (IQR): 0.97

Descriptive statistics

Standard deviation: 0.7561383417
Coefficient of variation (CV): 0.2734028752
Kurtosis: 0.5982886059
Mean: 2.765656144
Median Absolute Deviation (MAD): 0.49
Skewness: -0.237328036
Sum: 10600.76
Variance: 0.5717451918
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
3.05 | 68 | 1.8%
2.7 | 65 | 1.7%
2.97 | 64 | 1.7%
2.65 | 63 | 1.6%
3.08 | 63 | 1.6%
2.94 | 61 | 1.6%
2.75 | 61 | 1.6%
2.86 | 60 | 1.6%
2.73 | 60 | 1.6%
2.48 | 60 | 1.6%
Other values (152) | 3208 | 83.7%

Value | Count | Frequency (%)
0 | 21 | 0.5%
0.3 | 1 | < 0.1%
0.35 | 2 | 0.1%
0.54 | 2 | 0.1%
0.57 | 2 | 0.1%

Value | Count | Frequency (%)
5.4 | 1 | < 0.1%
5.1 | 1 | < 0.1%
4.97 | 1 | < 0.1%
4.94 | 1 | < 0.1%
4.91 | 3 | 0.1%
 

CustomerServiceCalls
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.55987477171928
Minimum: 0
Maximum: 9
Zeros: 801
Zeros (%): 20.9%
Memory size: 29.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 2
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.3197659
Coefficient of variation (CV): 0.8460716999
Kurtosis: 1.915390665
Mean: 1.559874772
Median Absolute Deviation (MAD): 1
Skewness: 1.132034302
Sum: 5979
Variance: 1.74178203
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
1 | 1369 | 35.7%
2 | 870 | 22.7%
0 | 801 | 20.9%
3 | 489 | 12.8%
4 | 184 | 4.8%
5 | 77 | 2.0%
6 | 27 | 0.7%
7 | 11 | 0.3%
9 | 3 | 0.1%
8 | 2 | 0.1%

Value | Count | Frequency (%)
0 | 801 | 20.9%
1 | 1369 | 35.7%
2 | 870 | 22.7%
3 | 489 | 12.8%
4 | 184 | 4.8%

Value | Count | Frequency (%)
9 | 3 | 0.1%
8 | 2 | 0.1%
7 | 11 | 0.3%
6 | 27 | 0.7%
5 | 77 | 2.0%
 

Churn?
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value | Count | Frequency (%)
False | 3289 | 85.8%
True | 544 | 14.2%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
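
The calculation just described can be sketched in a few lines. This is an illustrative implementation rather than the code the report uses; the function name is hypothetical. The data are the TotalMorMin and TotalMorCharge values of the ten first rows shown in the Sample section, a pair the report flags as highly correlated.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# TotalMorMin and TotalMorCharge from the ten "First rows" of the sample.
total_mor_min = [265.1, 161.6, 243.4, 299.4, 166.7,
                 223.4, 218.2, 157.0, 184.5, 258.6]
total_mor_charge = [45.07, 27.47, 41.38, 50.90, 28.34,
                    37.98, 37.09, 26.69, 31.37, 43.96]

r = pearson_r(total_mor_min, total_mor_charge)
print(f"r = {r:.6f}")

# Shifting and rescaling one variable (by a positive factor) leaves r unchanged.
r_rescaled = pearson_r([2 * v + 5 for v in total_mor_min], total_mor_charge)
print(f"r after rescaling = {r_rescaled:.6f}")
```

On these ten rows r is very close to 1, which matches the high-correlation warning for this pair.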

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
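
As a sketch of that recipe, the code below converts each variable to ranks and then applies the covariance-over-standard-deviations formula to the ranks. The helper names are hypothetical, and giving tied values their average rank is an assumption (a common convention) that the text above does not spell out.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                           * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: rho is 1 while r is below 1.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```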

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
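
Taken literally, this definition can be sketched as below. The function name is hypothetical, and the version shown (often called tau-a) ignores ties apart from counting tied pairs as neither concordant nor discordant; library implementations may apply a tie correction, so their results can differ on data with many repeated values.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Six pairs in total; only the pair (2, 3) vs (3, 2) is discordant.
tau = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```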

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
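
For illustration, a bias-corrected Cramér's V can be computed from a contingency table as sketched below. This follows the commonly cited form of Bergsma's correction and is not taken from the report's own source code; the function name and the two example tables are hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows)."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections to phi^2 and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical tables: perfect association and exact independence.
print(cramers_v_corrected([[50, 0], [0, 50]]))
print(cramers_v_corrected([[10, 20], [20, 40]]))
```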

Missing values

Sample

First rows

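(index) | State | AccountLength | AreaCode | PhoneNumber | InternationalPlan? | VoiceMailPlan? | NumEmailMessages | TotalMorMin | TotalMorCalls | TotalMorCharge | TotalEveMin | TotalEveCalls | TotalEveCharge | TotalNightMin | TotalNightCalls | TotalNightCharge | TotalIntMinutes | TotalIntCalls | TotalIntCharge | CustomerServiceCalls | Churn?
0 | KS | 128 | 415 | 3824657 | No | Yes | 25 | 265.1 | 110 | 45.07 | 197.4 | 99 | 16.78 | 244.7 | 91 | 11.01 | 10.0 | 3 | 2.70 | 1 | False
1 | OH | 107 | 415 | 3717191 | No | Yes | 26 | 161.6 | 123 | 27.47 | 195.5 | 103 | 16.62 | 254.4 | 103 | 11.45 | 13.7 | 3 | 3.70 | 1 | False
2 | NJ | 137 | 415 | 3581921 | No | No | 0 | 243.4 | 114 | 41.38 | 121.2 | 110 | 10.30 | 162.6 | 104 | 7.32 | 12.2 | 5 | 3.29 | 0 | False
3 | OH | 84 | 408 | 3759999 | Yes | No | 0 | 299.4 | 71 | 50.90 | 61.9 | 88 | 5.26 | 196.9 | 89 | 8.86 | 6.6 | 7 | 1.78 | 2 | False
4 | OK | 75 | 415 | 3306626 | Yes | No | 0 | 166.7 | 113 | 28.34 | 148.3 | 122 | 12.61 | 186.9 | 121 | 8.41 | 10.1 | 3 | 2.73 | 3 | False
5 | AL | 118 | 510 | 3918027 | Yes | No | 0 | 223.4 | 98 | 37.98 | 220.6 | 101 | 18.75 | 203.9 | 118 | 9.18 | 6.3 | 6 | 1.70 | 0 | False
6 | MA | 121 | 510 | 3559993 | No | Yes | 24 | 218.2 | 88 | 37.09 | 348.5 | 108 | 29.62 | 212.6 | 118 | 9.57 | 7.5 | 7 | 2.03 | 3 | False
7 | MO | 147 | 415 | 3299001 | Yes | No | 0 | 157.0 | 79 | 26.69 | 103.1 | 94 | 8.76 | 211.8 | 96 | 9.53 | 7.1 | 6 | 1.92 | 0 | False
8 | LA | 117 | 408 | 3354719 | No | No | 0 | 184.5 | 97 | 31.37 | 351.6 | 80 | 29.89 | 215.8 | 90 | 9.71 | 8.7 | 4 | 2.35 | 1 | False
9 | WV | 141 | 415 | 3308173 | Yes | Yes | 37 | 258.6 | 84 | 43.96 | 222.0 | 111 | 18.87 | 326.4 | 97 | 14.69 | 11.2 | 5 | 3.02 | 0 | False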

Last rows

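(index) | State | AccountLength | AreaCode | PhoneNumber | InternationalPlan? | VoiceMailPlan? | NumEmailMessages | TotalMorMin | TotalMorCalls | TotalMorCharge | TotalEveMin | TotalEveCalls | TotalEveCharge | TotalNightMin | TotalNightCalls | TotalNightCharge | TotalIntMinutes | TotalIntCalls | TotalIntCharge | CustomerServiceCalls | Churn?
3823 | SC | 38 | 415 | 3755439 | No | Yes | 31 | 197.2 | 118 | 33.52 | 249.9 | 70 | 21.24 | 298.9 | 104 | 13.45 | 3.9 | 2 | 1.05 | 0 | False
3824 | MI | 50 | 415 | 3613779 | No | Yes | 35 | 192.6 | 97 | 32.74 | 135.2 | 101 | 11.49 | 216.2 | 101 | 9.73 | 7.9 | 2 | 2.13 | 2 | False
3825 | MI | 45 | 510 | 3758934 | No | Yes | 26 | 91.7 | 104 | 15.59 | 150.6 | 119 | 12.80 | 63.3 | 103 | 2.85 | 7.7 | 5 | 2.08 | 1 | False
3826 | TN | 70 | 510 | 3954757 | No | No | 0 | 126.3 | 99 | 21.47 | 141.6 | 106 | 12.04 | 255.9 | 96 | 11.52 | 9.6 | 2 | 2.59 | 0 | False
3827 | NY | 147 | 510 | 4217205 | No | Yes | 33 | 251.5 | 107 | 42.76 | 234.1 | 110 | 19.90 | 213.4 | 87 | 9.60 | 10.4 | 6 | 2.81 | 3 | False
3828 | NV | 94 | 510 | 3798805 | No | No | 0 | 190.6 | 108 | 32.40 | 152.3 | 95 | 12.95 | 144.7 | 97 | 6.51 | 7.5 | 5 | 2.03 | 1 | False
3829 | IL | 179 | 510 | 3482150 | No | No | 0 | 116.1 | 101 | 19.74 | 201.8 | 99 | 17.15 | 181.9 | 103 | 8.19 | 11.6 | 5 | 3.13 | 0 | False
3830 | MS | 116 | 415 | 4179128 | No | No | 0 | 217.3 | 91 | 36.94 | 216.1 | 95 | 18.37 | 148.1 | 76 | 6.66 | 11.3 | 3 | 3.05 | 2 | False
3831 | ND | 59 | 510 | 3514226 | No | No | 0 | 179.4 | 80 | 30.50 | 232.5 | 99 | 19.76 | 175.8 | 105 | 7.91 | 14.7 | 3 | 3.97 | 0 | False
3832 | NC | 165 | 415 | 3306630 | No | No | 0 | 207.7 | 109 | 35.31 | 164.8 | 94 | 14.01 | 54.5 | 91 | 2.45 | 7.9 | 3 | 2.13 | 0 | False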

Duplicate rows

Most frequent

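(index) | State | AccountLength | AreaCode | PhoneNumber | InternationalPlan? | VoiceMailPlan? | NumEmailMessages | TotalMorMin | TotalMorCalls | TotalMorCharge | TotalEveMin | TotalEveCalls | TotalEveCharge | TotalNightMin | TotalNightCalls | TotalNightCharge | TotalIntMinutes | TotalIntCalls | TotalIntCharge | CustomerServiceCalls | Churn? | count
0 | AK | 41 | 415 | 3787733 | No | No | 0 | 159.3 | 66 | 27.08 | 125.9 | 75 | 10.70 | 261.9 | 76 | 11.79 | 11.1 | 5 | 3.00 | 1 | False | 2
1 | AK | 78 | 510 | 4189385 | No | No | 0 | 190.3 | 88 | 32.35 | 194.5 | 89 | 16.53 | 256.5 | 109 | 11.54 | 11.7 | 5 | 3.16 | 2 | False | 2
2 | AK | 108 | 415 | 3305462 | No | No | 0 | 103.0 | 129 | 17.51 | 242.3 | 103 | 20.60 | 170.2 | 89 | 7.66 | 7.9 | 3 | 2.13 | 1 | False | 2
3 | AK | 110 | 408 | 3962335 | No | No | 0 | 100.1 | 90 | 17.02 | 233.3 | 93 | 19.83 | 204.4 | 57 | 9.20 | 11.1 | 8 | 3.00 | 3 | False | 2
4 | AK | 111 | 415 | 3647719 | No | No | 0 | 172.8 | 58 | 29.38 | 183.1 | 108 | 15.56 | 158.8 | 104 | 7.15 | 7.9 | 3 | 2.13 | 4 | True | 2
5 | AK | 127 | 408 | 3839255 | No | No | 0 | 182.3 | 124 | 30.99 | 169.9 | 110 | 14.44 | 184.0 | 116 | 8.28 | 9.3 | 3 | 2.51 | 1 | False | 2
6 | AK | 130 | 415 | 3925587 | No | No | 0 | 242.5 | 101 | 41.23 | 102.8 | 114 | 8.74 | 142.4 | 89 | 6.41 | 9.3 | 2 | 2.51 | 2 | False | 2
7 | AK | 132 | 415 | 3459153 | No | Yes | 39 | 175.7 | 93 | 29.87 | 187.2 | 94 | 15.91 | 225.5 | 118 | 10.15 | 8.6 | 3 | 2.32 | 2 | False | 2
8 | AK | 156 | 510 | 3414075 | No | No | 0 | 123.7 | 96 | 21.03 | 103.0 | 80 | 8.76 | 189.4 | 82 | 8.52 | 13.1 | 4 | 3.54 | 1 | False | 2
9 | AL | 47 | 408 | 4045387 | No | Yes | 28 | 141.3 | 94 | 24.02 | 168.0 | 108 | 14.28 | 113.5 | 84 | 5.11 | 7.8 | 2 | 2.11 | 1 | False | 2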
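
A listing like the one above, with a count column for each repeated row, can be approximated by counting identical rows. The sketch below is an assumption about how such a listing could be built, not the report's implementation; the function name is hypothetical and the example rows are shortened to (State, AccountLength, PhoneNumber) for readability.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) for rows occurring more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, c) for row, c in counts.items() if c > 1]
    # Sort by descending count, then by row contents for a stable listing.
    repeated.sort(key=lambda item: (-item[1], item[0]))
    return repeated[:top]


# Shortened example rows; two of them appear twice.
rows = [
    ("AK", 41, 3787733),
    ("KS", 128, 3824657),
    ("AK", 78, 4189385),
    ("AK", 41, 3787733),
    ("AK", 78, 4189385),
]

for row, count in most_frequent_duplicates(rows):
    print(row, count)
```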